    From smart to deep: Robust activity recognition on smartwatches using deep learning

    The use of deep learning for activity recognition on wearables, such as smartwatches, is an understudied problem. To advance current understanding in this area, we perform a smartwatch-centric investigation of activity recognition under one of the most popular deep learning methods - Restricted Boltzmann Machines (RBM). This study includes a variety of typical behavior and context recognition tasks related to smartwatches (such as transportation mode, physical activities and indoor/outdoor detection) to which RBMs have never previously been applied. Our findings indicate that even a relatively simple RBM-based activity recognition pipeline is able to outperform a wide range of common modeling alternatives for all tested activity classes. However, the use of deep models is often accompanied by resource consumption that is unacceptably high for constrained devices like watches. Therefore, we complement this result with a study of the overhead of RBM-based activity models specifically on representative smartwatch hardware (the Snapdragon 400 SoC, present in many commercial smartwatches). These results show that, contrary to expectation, RBM models for activity recognition have acceptable levels of resource use for smartwatch-class hardware already on the market. Collectively, these two experimental results make a strong case for more widespread adoption of deep learning techniques within smartwatch designs moving forward.
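
    To make the pipeline concrete, here is a minimal sketch of an RBM-based activity recognizer in the style described above, using scikit-learn's BernoulliRBM as an unsupervised feature extractor feeding a simple classifier. The window size, feature ranges, and label set are illustrative assumptions, not the paper's configuration.

```python
# Minimal sketch of an RBM-based activity recognition pipeline.
# Assumes windowed, normalized sensor features in X (values in [0, 1],
# as BernoulliRBM expects) and activity labels in y; both are synthetic.
import numpy as np
from sklearn.neural_network import BernoulliRBM
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.model_selection import train_test_split

rng = np.random.RandomState(0)
X = rng.rand(1000, 64)          # 1000 sensor windows, 64 features each
y = rng.randint(0, 4, 1000)     # 4 activity classes, e.g. walk/run/still/vehicle

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

pipeline = Pipeline([
    # Unsupervised RBM layer learns a latent representation of the windows.
    ("rbm", BernoulliRBM(n_components=128, learning_rate=0.05,
                         n_iter=20, random_state=0)),
    # A lightweight classifier maps RBM features to activity labels.
    ("clf", LogisticRegression(max_iter=1000)),
])
pipeline.fit(X_train, y_train)
print("accuracy:", pipeline.score(X_test, y_test))
```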

    DeepEar: Robust smartphone audio sensing in unconstrained acoustic environments using deep learning

    Microphones are remarkably powerful sensors of human behavior and context. However, audio sensing is highly susceptible to wild fluctuations in accuracy when used in the diverse acoustic environments (such as bedrooms, vehicles, or cafes) that users encounter on a daily basis. Towards addressing this challenge, we turn to the field of deep learning, an area of machine learning that has radically changed related audio modeling domains like speech recognition. In this paper, we present DeepEar – the first mobile audio sensing framework built from coupled Deep Neural Networks (DNNs) that simultaneously perform common audio sensing tasks. We train DeepEar with a large-scale dataset including unlabeled data from 168 place visits. The resulting learned model, involving 2.3M parameters, enables DeepEar to significantly increase inference robustness to background noise beyond conventional approaches present in mobile devices. Finally, we show DeepEar is feasible for smartphones by building a cloud-free DSP-based prototype that runs continuously, using only 6% of the smartphone's battery daily.
    This is the author accepted manuscript. The final version is available from ACM via http://dx.doi.org/10.1145/2750858.280426
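
    The coupling of DNNs across tasks can be sketched as a shared trunk with one output head per audio task. The following PyTorch snippet is a hedged illustration only; the layer widths, feature dimensionality, and the four task heads are assumptions for exposition, not DeepEar's actual architecture.

```python
# Sketch of coupled DNNs sharing a low-level audio representation across tasks.
# Layer widths and the task list are illustrative, not DeepEar's exact design.
import torch
import torch.nn as nn

class CoupledAudioDNN(nn.Module):
    def __init__(self, n_features=390, tasks=None):
        super().__init__()
        # Shared trunk: a common acoustic representation for all tasks.
        self.trunk = nn.Sequential(
            nn.Linear(n_features, 512), nn.ReLU(),
            nn.Linear(512, 256), nn.ReLU(),
        )
        # One small head per sensing task (class counts are placeholders).
        tasks = tasks or {"ambient_scene": 8, "speaker_id": 10,
                          "emotion": 4, "stress": 2}
        self.heads = nn.ModuleDict(
            {name: nn.Linear(256, n_cls) for name, n_cls in tasks.items()})

    def forward(self, x):
        shared = self.trunk(x)
        return {name: head(shared) for name, head in self.heads.items()}

model = CoupledAudioDNN()
logits = model(torch.randn(32, 390))   # batch of 32 audio feature frames
print({k: v.shape for k, v in logits.items()})
```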

    Engagement-aware computing: Modelling user engagement from mobile contexts

    In this paper, we examine the potential of using mobile context to model user engagement. Taking an experimental approach, we systematically explore the dynamics of user engagement with a smartphone through three different studies. Specifically, to understand the feasibility of detecting user engagement from mobile context, we first assess engagement with an EEG device on 10 users and observe a strong correlation between automatically detected engagement scores and users' subjective perception of engagement. Grounded in this result, we model a set of application-level features derived from the smartphone usage of 10 users to detect the engagement of a usage session using a Random Forest classifier. Finally, we apply this model to a variety of contextual factors acquired from the smartphone usage logs of 130 users, training an SVM classifier that predicts user engagement with an F1-score of 0.82. Our experimental results highlight the potential of mobile contexts in designing engagement-aware applications and provide guidance for future explorations.
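
    A rough sketch of the two-stage modelling described above, with synthetic data standing in for the real studies: a Random Forest detects per-session engagement from application-level features, and an SVM trained on contextual factors is scored with F1. The feature dimensions and labels are placeholders, not the paper's data.

```python
# Sketch of the two-stage engagement model: Random Forest on per-session
# app-level features, then an SVM on contextual factors. Data is synthetic.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

rng = np.random.RandomState(0)
sessions = rng.rand(500, 12)          # app-level usage features per session
engaged = rng.randint(0, 2, 500)      # engagement label per session

rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf.fit(sessions, engaged)             # stage 1: session-level detector

contexts = rng.rand(2000, 8)          # contextual factors from usage logs
labels = rng.randint(0, 2, 2000)      # engagement labels (synthetic here)
Xtr, Xte, ytr, yte = train_test_split(contexts, labels, random_state=0)
svm = SVC(kernel="rbf").fit(Xtr, ytr) # stage 2: context-based predictor
print("F1:", f1_score(yte, svm.predict(Xte)))
```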

    Unsupervised domain adaptation under label space mismatch for speech classification

    Unsupervised domain adaptation (UDA) using adversarial learning has shown promise in adapting speech models from a labeled source domain to an unlabeled target domain. However, prior works make a strong assumption that the label spaces of the source and target domains are identical, which can easily be violated in real-world conditions. We present AMLS, an end-to-end architecture that performs Adaptation under Mismatched Label Spaces using two weighting schemes to separate shared and private classes in each domain. An evaluation on three speech adaptation tasks, namely gender, microphone, and emotion adaptation, shows that AMLS provides significant accuracy gains over baselines used in speech and vision adaptation tasks. Our contribution paves the way for applying UDA to speech models in unconstrained settings with no assumptions on the source and target label spaces.
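
    For intuition, the sketch below shows one simplified way to weight target samples inside a domain-adversarial objective, down-weighting low-confidence target samples as likely members of private classes. This is an assumption-laden stand-in for AMLS's two weighting schemes, which the abstract does not spell out; network sizes and the weighting rule are invented for illustration.

```python
# Simplified instance-weighted adversarial adaptation in the spirit of the
# approach above. The confidence-based weighting is an assumption, not AMLS.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        return x.view_as(x)
    @staticmethod
    def backward(ctx, g):
        return -g  # reverse gradients so the encoder fools the discriminator

encoder = nn.Sequential(nn.Linear(40, 64), nn.ReLU())
classifier = nn.Linear(64, 5)        # shared-class label predictor
discriminator = nn.Linear(64, 1)     # domain discriminator (source vs target)

def adversarial_step(x_src, y_src, x_tgt):
    z_src, z_tgt = encoder(x_src), encoder(x_tgt)
    cls_loss = nn.functional.cross_entropy(classifier(z_src), y_src)
    # Classifier confidence on target samples acts as a weight: samples the
    # model is unsure about (possible private classes) count less.
    w = classifier(z_tgt).softmax(dim=1).max(dim=1).values.detach()
    d_src = discriminator(GradReverse.apply(z_src)).squeeze(1)
    d_tgt = discriminator(GradReverse.apply(z_tgt)).squeeze(1)
    dom_loss = (nn.functional.binary_cross_entropy_with_logits(
                    d_src, torch.ones_like(d_src)) +
                (w * nn.functional.binary_cross_entropy_with_logits(
                    d_tgt, torch.zeros_like(d_tgt), reduction="none")).mean())
    return cls_loss + dom_loss

loss = adversarial_step(torch.randn(16, 40), torch.randint(0, 5, (16,)),
                        torch.randn(16, 40))
print(loss)
```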

    HeadScan: A Wearable System for Radio-Based Sensing of Head and Mouth-Related Activities

    The popularity of wearables continues to rise. However, their possible applications, and even their raw functionality, are constrained by the types of sensors currently available. Accelerometers and gyroscopes struggle to capture complex user activities. Microphones and image sensors are more powerful but capture privacy-sensitive information. Physiological sensors are obtrusive to users, as they often require skin contact and must be placed at certain body positions to function. In contrast, radio-based sensing uses wireless radio signals to capture movements of different parts of the body, and therefore provides a contactless and privacy-preserving approach to detecting and monitoring human activities. In this paper, we contribute to the search for new sensing modalities for the next generation of wearable devices by exploring the feasibility of mobile radio-based human activity recognition. We believe radio-based sensing has the potential to fundamentally transform wearables as we currently know them. As the first step towards this vision, we have designed and developed HeadScan, a first-of-its-kind wearable for radio-based sensing of a number of human activities that involve head and mouth movements. HeadScan only requires a pair of small antennas placed on the shoulder and collar and one wearable unit worn on the arm or the belt of the user. HeadScan uses fine-grained CSI measurements extracted from radio signals and incorporates a novel signal processing pipeline that converts the raw CSI measurements into the targeted human activities. To examine the feasibility and performance of HeadScan, we have collected approximately 50.5 hours of data from seven users. Our wide-ranging experiments include comparisons to a conventional skin-contact audio-based sensing approach tracking the same set of head- and mouth-related activities. Our experimental results highlight the enormous potential of our radio-based mobile sensing approach and provide guidance for future explorations.
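
    As a hedged illustration of the kind of signal-processing pipeline described (band-pass filtering CSI amplitude streams and summarizing windows into classifier features), the sketch below uses SciPy; the sampling rate, cut-off band, and window length are assumptions, not HeadScan's parameters.

```python
# Sketch of a CSI-style processing chain: band-pass filter subcarrier
# amplitude streams to isolate motion frequencies, then extract simple
# per-window features for a downstream classifier.
import numpy as np
from scipy.signal import butter, filtfilt

fs = 100.0                       # CSI sampling rate in Hz (assumed)
b, a = butter(4, [0.3, 8.0], btype="bandpass", fs=fs)

def csi_features(csi_amplitude, win=200):
    """csi_amplitude: (time, subcarriers) array of CSI magnitudes."""
    filtered = filtfilt(b, a, csi_amplitude, axis=0)
    feats = []
    for start in range(0, len(filtered) - win + 1, win):
        w = filtered[start:start + win]
        # Per-window summary statistics across subcarriers.
        feats.append(np.hstack([w.mean(0), w.std(0), np.abs(w).max(0)]))
    return np.array(feats)

csi = np.random.randn(1000, 30)  # synthetic: 10 s of CSI on 30 subcarriers
print(csi_features(csi).shape)
```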

    ZOE: A cloud-less dialog-enabled continuous sensing wearable exploiting heterogeneous computation

    The wearable revolution, as a mass-market phenomenon, has finally arrived. As a result, the question of how wearables should evolve over the next 5 to 10 years is assuming an increasing level of societal and commercial importance. A range of open design and system questions are emerging, for instance: How can wearables shift from being largely health and fitness focused to tracking a wider range of life events? What will become the dominant methods through which users interact with wearables and consume the data collected? Are wearables destined to be cloud and/or smartphone dependent for their operation? Towards building the critical mass of understanding and experience necessary to tackle such questions, we have designed and implemented ZOE – a matchbox-sized (49g) collar- or lapel-worn sensor that pushes the boundary of wearables in an important set of new directions. First, ZOE aims to perform multiple deep sensor inferences that span key aspects of everyday life (viz. personal, social and place information) on continuously sensed data, while also offering this data not only within conventional analytics but also through a speech dialog system that is able to answer impromptu casual questions from users (e.g., "Am I more stressed this week than normal?"). Crucially, and unlike other rich-sensing or dialog-supporting wearables, ZOE achieves this without cloud or smartphone support – this has important side-effects for privacy, since all user information can remain on the device. Second, ZOE incorporates the latest innovations in system-on-a-chip technology together with a custom daughter-board to realize a three-tier low-power processor hierarchy. We pair this hardware design with software techniques that manage system latency while still allowing ZOE to remain energy efficient (with a typical lifespan of 30 hours), despite its high sensing workload, small form-factor, and need to remain responsive to user dialog requests.
    This work was supported by Microsoft Research through its PhD Scholarship Program. We would also like to thank the anonymous reviewers and our shepherd, Jeremy Gummeson, for helping us improve the paper.
    This is the author accepted manuscript. The final version is available from ACM at http://dl.acm.org/citation.cfm?doid=2742647.2742672
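
    The three-tier processor hierarchy suggests a cascaded execution model in which cheap always-on checks gate progressively more expensive computation. The sketch below illustrates that gating pattern only; the tier functions, thresholds, and costs are invented for illustration and do not reflect ZOE's actual scheduling policy.

```python
# Illustration of tiered gating: the expensive tier runs only when the
# cheaper tiers find something interesting, keeping average power low.
import numpy as np

def tier1_activity_gate(window):
    # Cheapest tier (e.g. a sensor hub): crude energy threshold.
    return np.abs(window).mean() > 0.1

def tier2_coarse_classify(window):
    # Mid tier: a lightweight check that the signal merits full analysis.
    return window.std() > 0.5

def tier3_deep_inference(window):
    # Most expensive tier: the full model, run rarely. Placeholder result.
    return "inference-result"

def process(window):
    if not tier1_activity_gate(window):
        return None                      # stay in the lowest-power tier
    if not tier2_coarse_classify(window):
        return None
    return tier3_deep_inference(window)  # wake the big core only when needed

print(process(np.random.randn(256)))
```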

    DSP.Ear: Leveraging co-processor support for continuous audio sensing on smartphones

    The rapidly growing adoption of sensor-enabled smartphones has greatly fueled the proliferation of applications that use phone sensors to monitor user behavior. A central sensor among these is the microphone, which enables, for instance, the detection of valence in speech or the identification of speakers. Deploying several of these applications on a mobile device to continuously monitor the audio environment allows for the acquisition of a diverse range of sound-related contextual inferences. However, the cumulative processing burden critically impacts the phone battery. To address this problem, we propose DSP.Ear - an integrated sensing system that takes advantage of the latest low-power DSP co-processor technology in commodity mobile devices to enable the continuous and simultaneous operation of multiple established algorithms that perform complex audio inferences. The system extracts emotions from voice, estimates the number of people in a room, identifies the speakers, and detects commonly found ambient sounds, while critically incurring little overhead to the device battery. This is achieved through a series of pipeline optimizations that allow the computation to remain largely on the DSP. Through detailed evaluation of our prototype implementation we show that, by exploiting a smartphone's co-processor, DSP.Ear achieves a 3 to 7 times increase in battery lifetime compared to a solution that uses only the phone's main processor. In addition, DSP.Ear is 2 to 3 times more power efficient than a naive DSP solution without optimizations. We further analyze a large-scale dataset from 1320 Android users to show that in about 80-90% of daily usage instances DSP.Ear is able to sustain a full day of operation (even in the presence of other smartphone workloads) on a single battery charge.
    This work was supported by Microsoft Research through its PhD Scholarship Program.
    This is the author's accepted manuscript. The final version is available from ACM in the proceedings of the ACM Conference on Embedded Networked Sensor Systems: http://dl.acm.org/citation.cfm?id=2668349
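
    One common pipeline optimization in continuous audio sensing, and a plausible instance of the kind referred to above, is gating expensive inference behind a cheap silence filter so that most frames never reach the heavy classifiers. In the sketch below the frame length, RMS threshold, and heavy_inference placeholder are assumptions, not DSP.Ear's actual pipeline stages.

```python
# Silence-gated audio pipeline: cheap RMS check first, heavy models only on
# frames that pass. Threshold and frame size are illustrative assumptions.
import numpy as np

FRAME = 480                     # 30 ms at 16 kHz (assumed)
SILENCE_RMS = 0.01              # would be tuned on-device in practice

def frames(audio):
    for i in range(0, len(audio) - FRAME + 1, FRAME):
        yield audio[i:i + FRAME]

def run_pipeline(audio, heavy_inference):
    results, skipped = [], 0
    for f in frames(audio):
        rms = np.sqrt(np.mean(f ** 2))
        if rms < SILENCE_RMS:
            skipped += 1        # silent frame: skip all downstream models
            continue
        results.append(heavy_inference(f))
    return results, skipped

audio = np.concatenate([np.zeros(16000), 0.1 * np.random.randn(16000)])
res, skipped = run_pipeline(audio, lambda f: "speech-features")
print(len(res), "frames processed,", skipped, "skipped as silence")
```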

    More with less: Lowering user burden in mobile crowdsourcing through compressive sensing

    Mobile crowdsourcing is a powerful tool for collecting data of various types. The primary bottleneck in such systems is the high burden placed on the user, who must manually collect sensor data or respond in-situ to simple queries (e.g., experience sampling studies). In this work, we present Compressive CrowdSensing (CCS) - a framework that enables compressive sensing techniques to be applied to mobile crowdsourcing scenarios. CCS enables each user to provide significantly reduced amounts of manually collected data, while still maintaining acceptable levels of overall accuracy for the target crowd-based system. Naïve applications of compressive sensing do not work well for common types of crowdsourcing data (e.g., user survey responses) because the necessary correlations that are exploited by a sparsifying basis are hidden and non-trivial to identify. CCS comprises a series of novel techniques that enable such challenges to be overcome. We evaluate CCS with four representative large-scale datasets and find that it is able to outperform standard uses of compressive sensing, as well as conventional approaches to lowering the quantity of user data needed by crowd systems.
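
    For readers unfamiliar with the underlying machinery, the snippet below is a textbook compressive sensing demonstration: a signal sparse in a DCT basis is recovered from a small random subset of its samples via L1-regularized regression. It illustrates the general technique only; CCS's actual sparsifying transforms for survey-style data are the paper's contribution and are not shown here.

```python
# Classic compressive sensing demo: recover a DCT-sparse signal from a
# small random subset of samples using L1-regularized (Lasso) regression.
import numpy as np
from scipy.fftpack import idct
from sklearn.linear_model import Lasso

n = 256
rng = np.random.RandomState(0)
coeffs = np.zeros(n)
coeffs[rng.choice(n, 8, replace=False)] = 10 * rng.randn(8)

Psi = idct(np.eye(n), axis=0, norm="ortho")  # columns: inverse-DCT basis
signal = Psi @ coeffs                        # dense "ground truth" responses

m = 64                                       # collect only 64 of 256 samples
idx = rng.choice(n, m, replace=False)
A, y = Psi[idx], signal[idx]

lasso = Lasso(alpha=0.01, fit_intercept=False, max_iter=10000).fit(A, y)
recovered = Psi @ lasso.coef_
print("relative error: %.3f" %
      (np.linalg.norm(recovered - signal) / np.linalg.norm(signal)))
```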

    BodyScan: Enabling Radio-based Sensing on Wearable Devices for Contactless Activity and Vital Sign Monitoring

    Wearable devices are increasingly becoming mainstream consumer products carried by millions of consumers. However, the potential impact of these devices is currently constrained by fundamental limitations of their built-in sensors. In this paper, we introduce radio as a new powerful sensing modality for wearable devices and propose to transform radio into a mobile sensor of human activities and vital signs. We present BodyScan, a wearable system that enables radio to act as a single modality capable of providing whole-body continuous sensing of the user. BodyScan overcomes key limitations of existing wearable devices by providing a contactless and privacy-preserving approach to capturing a rich variety of human activities and vital sign information. Our prototype design of BodyScan consists of two components, one worn on the hip and the other on the wrist, and is inspired by the increasingly prevalent scenario where a user carries a smartphone while also wearing a wristband/smartwatch. The prototype can support daily usage on a single charge per day. Experimental results show that in controlled settings, BodyScan can recognize a diverse set of human activities while also estimating the user's breathing rate with high accuracy. Even in very challenging real-world settings, BodyScan can still infer activities with an average accuracy above 60% and monitor breathing rate information for a reasonable portion of each day.
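
    Breathing-rate estimation from a periodic radio signal can be sketched as band-pass filtering to the respiration band followed by spectral peak picking. The snippet below does this on synthetic data; the sampling rate and band limits are assumptions, not BodyScan's design.

```python
# Breathing-rate sketch: band-pass to the respiration band, then locate the
# dominant spectral peak. Rates and band limits are illustrative.
import numpy as np
from scipy.signal import butter, filtfilt

fs = 20.0                                  # radio sample rate in Hz (assumed)
t = np.arange(0, 60, 1 / fs)               # one minute of signal
breaths_hz = 0.25                          # synthetic: 15 breaths per minute
signal = np.sin(2 * np.pi * breaths_hz * t) + 0.5 * np.random.randn(len(t))

b, a = butter(2, [0.1, 0.7], btype="bandpass", fs=fs)   # 6-42 breaths/min
filtered = filtfilt(b, a, signal)

spectrum = np.abs(np.fft.rfft(filtered))
freqs = np.fft.rfftfreq(len(filtered), 1 / fs)
peak = freqs[np.argmax(spectrum[1:]) + 1]  # skip the DC bin
print("estimated breathing rate: %.1f breaths/min" % (peak * 60))
```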

    Cost-aware compressive sensing for networked sensing systems

    Compressive sensing is a technique that can help reduce the sampling rate of sensing tasks. In mobile crowdsensing applications or wireless sensor networks, the resource burden of collecting samples is often a major concern, so compressive sensing is a promising approach in such scenarios. An implicit assumption underlying compressive sensing - both in theory and in its applications - is that every sample has the same cost: its goal is simply to reduce the number of samples while achieving good recovery accuracy. In many networked sensing systems, however, the cost of obtaining a specific sample may depend heavily on the location, time, device condition, and many other factors of the sample. In this paper, we study compressive sensing in situations where different samples have different costs, and we seek a good trade-off between minimizing the total sample cost and the resulting recovery accuracy. We design Cost-Aware Compressive Sensing (CACS), which incorporates the cost diversity of samples into the compressive sensing framework, and we apply CACS to networked sensing systems. Technically, we use regularized column sum (RCS) as a predictive metric for recovery accuracy, and use this metric to design an optimization algorithm for finding a least-cost randomized sampling scheme with provable recovery bounds. We also show how CACS can be applied in a distributed context. Using traffic monitoring and air pollution as concrete application examples, we evaluate CACS on large-scale real-life traces. Our results show that CACS achieves significant cost savings, outperforming natural baselines (greedy and random sampling) by up to 4x.
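
    To give a flavor of cost-aware sample selection, the sketch below greedily picks measurement rows that improve a simple column-energy proxy per unit cost. The proxy is a loose stand-in for the regularized column sum (RCS) metric, whose exact definition the abstract does not give; the costs and candidate matrix are synthetic.

```python
# Greedy cost-aware selection: pick samples that most improve the weakest
# column's energy per unit cost, under a total budget. The column-energy
# score is a simplified stand-in for the paper's RCS metric.
import numpy as np

rng = np.random.RandomState(0)
n, budget = 100, 10.0
Phi = rng.randn(n, 40)                  # candidate measurement rows
cost = 0.5 + rng.rand(n)                # per-sample cost (assumed known)

selected, spent = [], 0.0
col_energy = np.zeros(Phi.shape[1])     # energy accumulated per column
while True:
    best, best_ratio, best_energy = None, -np.inf, None
    for i in range(n):
        if i in selected or spent + cost[i] > budget:
            continue
        cand = col_energy + Phi[i] ** 2
        # Gain in the weakest column's energy, per unit cost of the sample.
        ratio = (cand.min() - col_energy.min()) / cost[i]
        if ratio > best_ratio:
            best, best_ratio, best_energy = i, ratio, cand
    if best is None:
        break
    selected.append(best)
    spent += cost[best]
    col_energy = best_energy

print("picked %d samples at total cost %.2f" % (len(selected), spent))
```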